Speech feature compensation based on pseudo stereo codebooks for robust speech recognition in additive noise environments
Authors
Abstract
In this paper, we propose several compensation approaches to alleviate the effect of additive noise on speech features for speech recognition. These approaches are simple yet efficient noise reduction techniques that use pseudo stereo codebooks, constructed online, to evaluate the statistics in both clean and noisy environments. The process yields transforms that bring noise-corrupted speech features closer to their clean counterparts. We apply these compensation approaches to various well-known speech features, including mel-frequency cepstral coefficients (MFCC), autocorrelation mel-frequency cepstral coefficients (AMFCC) and perceptual linear prediction cepstral coefficients (PLPCC). Experiments conducted on the Aurora-2 database show that the proposed approaches yield a significant performance gain for all of these feature types, compared both to the baseline results and to those obtained with conventional utterance-based cepstral mean and variance normalization (CMVN).
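For orientation, the conventional baseline named in the abstract, utterance-based CMVN, can be sketched in a few lines. This is a minimal illustration of the standard technique only, not the proposed pseudo stereo codebook method, whose details the abstract does not give. The function name `cmvn`, the list-of-frames input layout, and the `eps` guard are assumptions made for this example.

```python
import math


def cmvn(frames, eps=1e-8):
    """Utterance-level cepstral mean and variance normalization.

    frames: list of equal-length feature vectors, one per frame
            (assumed layout; e.g. 13 MFCCs per frame).
    eps:    small constant (assumption) to avoid division by zero
            when a coefficient is constant over the utterance.
    """
    num_frames = len(frames)
    dim = len(frames[0])

    # Per-coefficient mean over all frames of the utterance.
    means = [sum(f[d] for f in frames) / num_frames for d in range(dim)]

    # Per-coefficient (population) standard deviation over the utterance.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / num_frames)
        for d in range(dim)
    ]

    # Shift each coefficient to zero mean and scale to unit variance.
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in frames
    ]


# Toy utterance with three frames and two coefficients per frame.
normalized = cmvn([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
```

Because the statistics are computed per utterance, each coefficient track ends up with zero mean and roughly unit variance regardless of the noise condition, which is what makes CMVN a common point of comparison for feature compensation methods.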
Similar resources
加成性雜訊環境下運用特徵參數統計補償法於強健性語音辨識 (Feature Statistics Compensation for Robust Speech Recognition in Additive Noise Environments) [In Chinese]
In this paper, we propose several compensation approaches to alleviate the effect of additive noise on speech features for speech recognition. These approaches are simple yet efficient noise reduction techniques that use online constructed pseudo stereo codebooks to evaluate the statistics in both clean and noisy environments. The process yields transforms for noise-corrupted speech features to...
A new method for robust speech recognition based on missing data using a bidirectional neural network
The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the signal's time-frequency representation (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...
端點偵測技術在強健語音參數擷取之研究 (Study of the Voice Activity Detection Techniques for Robust Speech Feature Extraction) [In Chinese]
The performance of a speech recognition system is often degraded by the mismatch between the development and application environments. One of the major sources of this mismatch is additive noise. The approaches for handling the problem of additive noise can be divided into three classes: speech enhancement, robust speech feature extraction, and compensation of speech mode...
Effect of the Environment on Speech Statistics
This paper describes a series of cepstral-based compensation procedures that render the SPHINX-II continuous speech recognition system more robust with respect to acoustical changes in the environment. The first two algorithms, SNR based MultivaRiate gAussian based cepsTral normaliZation (SNR-based RATZ) and STAtistical Reestimation of HMMs (STAR), compensate for environmental degradation based...
Feature Compensation for Speech Recognition in Severely Adverse Environments Due to Background Noise and Channel Distortion
This paper proposes an effective feature compensation scheme to address severely adverse environments for robust speech recognition, where background noise and channel distortion are simultaneously involved. An iterative channel estimation method is integrated into the framework of our Parallel Combined Gaussian Mixture Model (PCGMM) based feature compensation algorithm [1]. A new speech corpus...